AITopics

2505.01742

Country: Asia (0.29)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence Mar-7-2025

EdgeMoE: Empowering Sparse Large Language Models on Mobile Devices

Yi, Rongjie, Guo, Liwei, Wei, Shiyun, Zhou, Ao, Wang, Shangguang, Xu, Mengwei

Large language models (LLMs) such as GPTs and Mixtral-8x7B have revolutionized machine intelligence due to their exceptional abilities in generic ML tasks. Transitioning LLMs from datacenters to edge devices brings benefits like better privacy and availability, but is challenged by their massive parameter size and thus unbearable runtime costs. To this end, we present EdgeMoE, an on-device inference engine for mixture-of-experts (MoE) LLMs, a popular form of sparse LLM that scales its parameter size with almost constant computing complexity. EdgeMoE achieves both memory and compute efficiency by partitioning the model across the storage hierarchy: non-expert weights are held in device memory, while expert weights are held on external storage and fetched into memory only when activated. This design is motivated by the key observation that expert weights are bulky but infrequently used due to sparse activation. To further reduce the expert I/O swapping overhead, EdgeMoE incorporates two novel techniques: (1) expert-wise bitwidth adaptation, which reduces expert sizes with tolerable accuracy loss; and (2) expert preloading, which predicts the activated experts ahead of time and preloads them with the compute-I/O pipeline. On popular MoE LLMs and edge devices, EdgeMoE showcases significant memory savings and speedup over competitive baselines. The code is available at https://github.com/UbiquitousLearning/mllm.
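
The storage-hierarchy idea in this abstract can be illustrated with a minimal sketch. It is not EdgeMoE's implementation: the `ExpertCache` class, the LRU eviction policy, the scalar "weights", and the hard-coded prediction `[1, 5]` are all assumptions made for illustration, and preloading here is synchronous rather than pipelined with compute.

```python
from collections import OrderedDict


class ExpertCache:
    """Keeps at most `capacity` experts in memory; the rest stay on 'storage'."""

    def __init__(self, storage, capacity):
        self.storage = storage        # stand-in for external storage: expert_id -> weights
        self.capacity = capacity
        self.memory = OrderedDict()   # resident experts, least recently used first
        self.loads = 0                # number of simulated storage reads

    def fetch(self, expert_id):
        if expert_id in self.memory:
            self.memory.move_to_end(expert_id)   # mark as recently used
            return self.memory[expert_id]
        self.loads += 1                          # cache miss: read from storage
        weights = self.storage[expert_id]
        self.memory[expert_id] = weights
        if len(self.memory) > self.capacity:
            self.memory.popitem(last=False)      # evict least recently used (assumed policy)
        return weights

    def preload(self, predicted_ids):
        # Bring predicted experts into memory before the layer needs them.
        for expert_id in predicted_ids:
            self.fetch(expert_id)


def moe_layer(x, gate_scores, cache, top_k=2):
    """Route x to the top_k experts by gate score and mix their outputs."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:top_k]
    total = sum(gate_scores[i] for i in chosen)
    return sum(gate_scores[i] / total * cache.fetch(i) * x for i in chosen)


storage = {i: float(i + 1) for i in range(8)}   # 8 toy experts, each a scalar weight
cache = ExpertCache(storage, capacity=3)
cache.preload([1, 5])                            # hypothetical predictor output
y = moe_layer(2.0, [0.0, 0.6, 0.0, 0.0, 0.0, 0.2, 0.1, 0.1], cache)
```

Because the two experts the gate selects were already preloaded, the layer call triggers no further storage reads; a wrong prediction would show up as an increase in `cache.loads`.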

arxiv preprint arxiv, edgemoe, inference, (13 more...)

2308.14352

Country:

Asia > China > Beijing > Beijing (0.05)
Europe > United Kingdom > England (0.04)
North America > United States > Virginia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry: Information Technology > Hardware (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Alvarado, Daniela, Asif, Dr. Seemal

A Framework for Controlling Multiple Industrial Robots using Mobile Applications

arXiv.org Artificial Intelligence Mar-12-2024

Purpose: Over the last few decades, the development of hardware and software has enabled the application of advanced systems. In the robotics field, UI design is an intriguing area to explore because devices now offer a wide range of functionality in a reduced size. Moreover, using the same UI to control several systems is of great interest, since it requires less learning effort and time from users. Therefore, this paper presents a mobile application to control two industrial robots with four modes of operation. Design/methodology/approach: The smartphone was selected as the interface due to its wide range of capabilities, and MIT App Inventor, whose environment is supported by Android smartphones, was used to create the application. For the validation, ROS was used since it is a fundamental framework utilised in industrial robotics, and the Arduino Uno was used to establish the data transmission between the smartphone and the NVIDIA Jetson TX2 board. In MIT App Inventor, the graphical interface was created to visualize the options available in the app, whereas two Python scripts were written to perform the simulations in ROS and carry out the tests. Findings: The results indicated that using the sliders to control the robots is more favourable than using the Orientation Sensor, due to the sensitivity of the sensor and the human difficulty of holding the smartphone perfectly still. Another important finding was the limitations of the autonomous mode, in which the robot grabs an object. In this case, the configuration of the Kinect camera and the controllers has a significant impact on the success of the simulation. Finally, it was observed that the delay was appropriate despite the use of the Arduino Uno to transfer the data between the smartphone and the NVIDIA Jetson TX2.

application, controlling multiple industrial robot, robot, (14 more...)

2403.07639

Country: Europe > United Kingdom > England > Buckinghamshire > Milton Keynes (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology (1.00)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)

Yan, Minghao, Wang, Hongyi, Venkataraman, Shivaram

PolyThrottle: Energy-efficient Neural Network Inference on Edge Devices

arXiv.org Artificial Intelligence Jan-9-2024

As neural networks (NNs) are deployed across diverse sectors, their energy demand grows correspondingly. While several prior works have focused on reducing energy consumption during training, the continuous operation of ML-powered systems leads to significant energy use during inference. This paper investigates how the configuration of on-device hardware (elements such as GPU, memory, and CPU frequency, often neglected in prior studies) affects energy consumption for NN inference with regular fine-tuning. We propose PolyThrottle, a solution that optimizes configurations across individual hardware components using Constrained Bayesian Optimization in an energy-conserving manner. Our empirical evaluation uncovers novel facets of the energy-performance equilibrium, showing that we can save up to 36 percent of energy for popular models.
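
To make the configuration-search problem concrete, here is a minimal sketch that picks the lowest-energy frequency setting under a latency budget. It deliberately swaps in plain exhaustive search for the Constrained Bayesian Optimization named in the abstract, and everything in it is an assumption: the frequency grids, the toy `latency_ms` and `power_w` models, and the choice of latency as the constraint. On a real device each evaluation would be a measurement, which is where a sample-efficient optimizer would replace the enumeration.

```python
from itertools import product

# Hypothetical frequency steps (MHz) for each component.
GPU_MHZ = [300, 600, 900, 1200]
CPU_MHZ = [500, 1000, 1500, 2000]
MEM_MHZ = [800, 1600]


def latency_ms(gpu, cpu, mem):
    # Toy model: each component's share of the work shrinks as its clock rises.
    return 6000.0 / gpu + 2000.0 / cpu + 1600.0 / mem


def power_w(gpu, cpu, mem):
    # Toy model: static power plus terms that grow with frequency.
    return 1.0 + 4.0 * (gpu / 1200) ** 2 + 2.0 * (cpu / 2000) ** 2 + 1.0 * (mem / 1600)


def energy_mj(gpu, cpu, mem):
    # Energy per inference: watts times milliseconds gives millijoules.
    return power_w(gpu, cpu, mem) * latency_ms(gpu, cpu, mem)


def best_config(latency_budget_ms):
    """Return the feasible (gpu, cpu, mem) setting with the lowest energy, or None."""
    feasible = [
        cfg for cfg in product(GPU_MHZ, CPU_MHZ, MEM_MHZ)
        if latency_ms(*cfg) <= latency_budget_ms
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda cfg: energy_mj(*cfg))


cfg = best_config(12.0)
```

Under this toy model the maximum-frequency setting is always feasible for a 12 ms budget, so the search can only match or improve on its energy.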

artificial intelligence, frequency, machine learning, (19 more...)

2310.19991

Country: North America > United States > Wisconsin (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

arXiv.org Artificial Intelligence Dec-21-2023

ElasticTrainer: Speeding Up On-Device Training with Runtime Elastic Tensor Selection

Huang, Kai, Yang, Boyuan, Gao, Wei

On-device training is essential for neural networks (NNs) to continuously adapt to new online data, but can be time-consuming due to the device's limited computing power. To speed up on-device training, existing schemes select the trainable NN portion offline or conduct unrecoverable selection at runtime, so the evolution of the trainable NN portion is constrained and cannot adapt to the current need for training. Instead, runtime adaptation of on-device training should be fully elastic, i.e., every NN substructure can be freely removed from or added to the trainable NN portion at any time in training. In this paper, we present ElasticTrainer, a new technique that enforces such elasticity to achieve the required training speedup with the minimum NN accuracy loss. Experimental results show that ElasticTrainer achieves up to 3.5x more training speedup in wall-clock time and reduces energy consumption by 2x-3x more compared to the existing schemes, without noticeable accuracy loss.
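
One way to picture elastic selection is as a budgeted choice that is redone as training progresses. The sketch below is only an illustration under stated assumptions: the abstract does not say how tensors are scored or selected, so the per-tensor `importance` values, the integer `cost_ms` estimates, and the 0/1 knapsack formulation in `select_tensors` are all hypothetical.

```python
def select_tensors(importance, cost_ms, budget_ms):
    """0/1 knapsack: pick tensors maximizing total importance within a time budget.

    importance: hypothetical per-tensor benefit of training that tensor
    cost_ms:    hypothetical integer per-tensor training cost in milliseconds
    budget_ms:  integer time budget for one training step
    """
    n = len(importance)
    # best[i][b] = best total importance using the first i tensors within budget b
    best = [[0.0] * (budget_ms + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for b in range(budget_ms + 1):
            best[i][b] = best[i - 1][b]                      # skip tensor i-1
            if cost_ms[i - 1] <= b:
                cand = best[i - 1][b - cost_ms[i - 1]] + importance[i - 1]
                if cand > best[i][b]:
                    best[i][b] = cand                        # train tensor i-1
    # Walk back through the table to recover which tensors were chosen.
    chosen, b = [], budget_ms
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= cost_ms[i - 1]
    return sorted(chosen)


cost = [5, 1, 3, 3]
early = select_tensors([0.9, 0.1, 0.5, 0.4], cost, budget_ms=6)
later = select_tensors([0.1, 0.1, 0.6, 0.5], cost, budget_ms=6)
```

Re-running the selection with updated importance values lets tensors enter and leave the trainable set freely: here tensors 0 and 1 are chosen early, and tensors 2 and 3 replace them later.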

elastictrainer, selection, tensor, (15 more...)

doi: 10.1145/3581791.3596852

2312.14227

Country:

Europe > Finland > Uusimaa > Helsinki (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

#artificialintelligence Oct-3-2019, 09:45:07 GMT

Inception Spotlight: New Skydio 2 Drone Powered by NVIDIA Jetson GPUs Can Track up to 10 Objects at a Time - NVIDIA Developer News Center

Redwood City, California-based Skydio, a member of NVIDIA's startup accelerator, Inception, has just released the latest version of its AI-capable, GPU-accelerated drone, Skydio 2. Equipped with six 4K cameras and an NVIDIA Jetson TX2 as the processor for the autonomous system, Skydio 2 can fly for up to 23 minutes at a time and can be piloted either by an experienced pilot or by the AI-based system. The Jetson TX2 has 256 GPU cores and is capable of 1.3 trillion operations a second. According to the team, the drone uses nine custom deep neural networks that help it track up to 10 objects while traveling at speeds of 36 miles per hour. "Skydio 2 enables you to capture everything from a backyard pickup game to a downhill adventure with a single tap," the company wrote in a blog post. "It builds on Skydio R1's foundation and takes it to the next level."

inception spotlight, nvidia developer news center, skydio 2, (5 more...)

Country: North America > United States > California > San Mateo County > Redwood City (0.27)

Industry: Information Technology > Hardware (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.76)
Information Technology > Communications > Social Media (0.72)

#artificialintelligence Aug-28-2019, 19:02:58 GMT

AI Helps Protect Taiwan's Endangered Leopard Cats - NVIDIA Blog

There's no mistaking why the leopard cat of Taiwan got its name. While only about the size of domestic felines, it sports a beautiful, flower-spotted pattern on its fur. There's also no debate about why the leopard cat, the only remaining native wild cat species in Taiwan, is on the edge of extinction. Fewer than 500 of the leopard cats live in a natural habitat that overlaps with many development projects in the central regions of the island. In an otherwise rural area, the cats are often victims of roadkill due to increased traffic.

artificial intelligence, leopard cat, machine learning, (14 more...)

Country: Asia > Taiwan (1.00)

Industry:

Information Technology > Hardware (0.44)
Information Technology > Services (0.33)

Technology:

Information Technology > Communications > Social Media (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.33)

arXiv.org Machine Learning Dec-17-2018

ECC: Energy-Constrained Deep Neural Network Compression via a Bilinear Regression Model

Yang, Haichuan, Zhu, Yuhao, Liu, Ji

Many DNN-enabled vision applications, such as those running on unmanned aerial vehicles, augmented reality headsets, and smartphones, constantly operate under severe energy constraints. Designing DNNs that can meet a stringent energy budget is becoming increasingly important. This paper proposes ECC, a framework that compresses DNNs to meet a given energy constraint while minimizing accuracy loss. The key idea of ECC is to model the DNN energy consumption via a novel bilinear regression function. The energy estimate model allows us to formulate DNN compression as a constrained optimization that minimizes the DNN loss function subject to the energy constraint. The optimization problem, however, has nontrivial constraints, so existing deep learning solvers do not apply directly. We propose an optimization algorithm that combines the essence of the Alternating Direction Method of Multipliers (ADMM) framework with gradient-based learning algorithms. The algorithm decomposes the original constrained optimization into several subproblems that are solved iteratively and efficiently. ECC is also portable across different hardware platforms without requiring hardware knowledge. Experiments show that ECC achieves higher accuracy under the same or lower energy budget compared to state-of-the-art resource-constrained DNN compression techniques.
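
The constrained problem described in this abstract can be written compactly. The notation is an assumption of this note rather than the paper's: $W$ denotes the network weights, $\ell$ the training loss, $\mathcal{E}$ the learned energy estimate, and $E_{\text{budget}}$ the given energy budget.

```latex
\min_{W} \; \ell(W)
\quad \text{subject to} \quad
\mathcal{E}(W) \le E_{\text{budget}}
```

For reference, a function of two arguments is bilinear when it is linear in each argument with the other held fixed, for example $f(u, v) = u^{\top} A v$ for a fixed matrix $A$. The abstract does not state which quantities play the roles of $u$ and $v$ in ECC's energy model.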

artificial intelligence, machine learning, sparsity, (16 more...)

1812.01803

Genre: Research Report (0.50)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligence Aug-22-2018, 21:26:40 GMT

Popping Big Data Fallacies On the Edge

Organizations today are drowning in data. But there continues to be vigorous debate on the best way to deal with that data. While some advocate creating big data lakes to store data that will subsequently be used for training machine learning models, there's a growing chorus of voices calling for a simpler and more real-time approach. You can count Simon Crosby, CTO of SWIM.ai, among proponents for a lighter-weight and less expensive approach to data collection and analysis, at least for a certain class of real-world machine learning problems at the edge. During a recent conversation with Datanami, Crosby threw cold water on the notion that uploading data to the cloud for storage and machine learning was the best way to get value out of the morasses of data created on edge devices.

artificial intelligence, data mining, machine learning, (17 more...)

Country:

North America > United States > California > Santa Clara County > San Jose (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.05)

Industry: Information Technology (0.97)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.93)

#artificialintelligence Oct-29-2017, 22:00:25 GMT

TensorFlow Gains Hardware Support

There are a number of machine learning (ML) architectures that utilize deep neural networks (DNNs), including AlexNet, VGGNet, GoogLeNet, Inception, ResNet, FCN, and U-Net. These in turn run on frameworks like Berkeley's Caffe, Google's TensorFlow, Torch, Microsoft's Cognitive Toolkit (CNTK), and Apache's MXNet. Of course, support for these frameworks on specific hardware is required to actually run the ML applications. Each framework has advantages and disadvantages. For example, Caffe is an easy platform to start with, especially since one of its popular uses is image recognition.

artificial intelligence, machine learning, tensorflow, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.65)